Contributor

@luffy-yu luffy-yu commented Dec 1, 2025

Motivation

  • The current Android LlamaDemo only supports XNNPACK.
  • A QNN-backend Android demo is missing, although the qnn_llama_runner approach is provided.

Summary

  • Add guidance for building and running the Android LlamaDemo with the QNN backend.
  • It's verified on Samsung S23 (SoC SM8550).

Test plan

The APK file and the pre-built model can be found in its README.

Copilot AI review requested due to automatic review settings December 1, 2025 03:07

pytorch-bot bot commented Dec 1, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16011

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 3865295 with merge base 12d17ef:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.


meta-cla bot commented Dec 1, 2025

Hi @luffy-yu!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Contributor

Copilot AI left a comment

Pull request overview

This PR adds comprehensive documentation for building and running the Android LlamaDemo application with Qualcomm's QNN backend support. Previously, the Android demo only supported XNNPACK, and this guide fills that gap by providing step-by-step instructions verified on Samsung S23 hardware.

Key changes:

  • Added complete build and deployment guide for QNN-enabled Android LlamaDemo
  • Included verification steps to confirm correct AAR and APK configuration
  • Documented common issues and their solutions for QNN Android deployment

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 1, 2025 03:11
@meta-cla meta-cla bot added the CLA Signed label Dec 1, 2025
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 1, 2025 03:12
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.


Comment on lines +291 to +292
### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend


Copilot AI Dec 1, 2025

The external link points to 'meta-pytorch/executorch-examples' but the PR is in the 'pytorch/executorch' repository. Verify this is the correct repository reference for the LlamaDemo, as it may confuse users if the link is incorrect or if the demo has moved.

Suggested change
### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
### Run [Android LlamaDemo](https://github.com/pytorch/executorch/tree/main/examples/llm/android/LlamaDemo) with QNN backend
> **Note:** If the LlamaDemo is not present in the `pytorch/executorch` repository, you can find it in the [meta-pytorch/executorch-examples](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) repository.

# Build the AAR
cd $EXECUTORCH_ROOT
export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
./scripts/build_android_library.sh

Copilot AI Dec 1, 2025

The script invocation lacks context about required environment variables or prerequisites. Consider adding a note about any required environment setup (e.g., Android NDK path, QNN SDK path) before running this script, or reference existing documentation that covers these prerequisites.
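
One way to act on this comment is a small pre-flight check that runs before the build script. The sketch below is only illustrative: `check_env` is a hypothetical helper, and `ANDROID_NDK` and `QNN_SDK_ROOT` are assumed variable names that this PR does not confirm, so adjust them to whatever your setup actually requires.

```bash
# check_env NAME...: report every listed variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
check_env() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (variable names are assumptions, see above):
# check_env EXECUTORCH_ROOT ANDROID_NDK QNN_SDK_ROOT && ./scripts/build_android_library.sh
```

Failing early with a list of missing variables is usually easier to diagnose than a build error partway through the script.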

Comment on lines +328 to +329
adb push model.pte /data/local/tmp/llama
adb push tokenizer.bin /data/local/tmp/llama

Copilot AI Dec 1, 2025

The documentation doesn't specify where to obtain 'model.pte' and 'tokenizer.bin'. Add a reference to instructions for generating or downloading these required files, especially since the QNN backend requires specific model compilation steps.

Comment on lines 348 to 350
# Check for debug strings in the AAR
unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
strings | grep "YOUR DEBUG INFO"

Copilot AI Dec 1, 2025

The placeholder 'YOUR DEBUG INFO' is ambiguous and doesn't provide actionable guidance. Either provide a concrete example of what debug string to search for, or explain what type of debug information users should add to their code for verification purposes.

Suggested change
# Check for debug strings in the AAR
unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
strings | grep "YOUR DEBUG INFO"
# Check for your custom debug string in the AAR.
# For example, if you added a debug string like "MyCustomDebugString" to your code,
unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
strings | grep "MyCustomDebugString"
# Replace "MyCustomDebugString" with the actual debug string you added to your code.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 1, 2025 03:14
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.



**Solution**:
1. Update `build.gradle.kts` with matching QNN runtime version


Copilot AI Dec 1, 2025

The note appears after the 'Before' label but should appear before showing any code examples. Moving it above line 395 would make the documentation flow more clearly and prevent confusion about which versions to use.

Suggested change

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 1, 2025 03:14
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


unzip -l app/build/outputs/apk/debug/app-debug.apk | grep "libQnnHtp.so"
```

Expected size for QNN 2.37.0: ~2,465,440 bytes

Copilot AI Dec 1, 2025

[nitpick] The expected file size is overly specific and may vary slightly between builds or minor version updates. Consider changing this to a range (e.g., '~2.4-2.5 MB') to make the verification more resilient to minor variations.

Suggested change
Expected size for QNN 2.37.0: ~2,465,440 bytes
Expected size for QNN 2.37.0: ~2.4–2.5 MB (2,400,000–2,500,000 bytes)
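
The range check suggested here could be scripted roughly as follows. This is a minimal sketch: `size_in_range` is a hypothetical helper, and the bounds are simply the figures quoted in the suggestion, not values confirmed for other QNN versions.

```bash
# size_in_range BYTES MIN MAX: succeed when MIN <= BYTES <= MAX.
size_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# In practice the byte count would come from the first column of `unzip -l`, e.g.
#   unzip -l app/build/outputs/apk/debug/app-debug.apk | awk '/libQnnHtp.so/ {print $1}'
# Here the size quoted for QNN 2.37.0 is used directly.
if size_in_range 2465440 2400000 2500000; then
  echo "libQnnHtp.so size is within the expected range"
fi
```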

**Solution**:
1. Update `build.gradle.kts` with matching QNN runtime version

> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.

Copilot AI Dec 1, 2025

[nitpick] The note about version numbers being examples appears after mentioning 'version X' and 'version Y' in the previous text but before showing the actual version numbers. This note would be more effective if placed immediately after the code examples (after line 403) where the specific versions are shown, or reference should be made to it in the text above where X and Y are mentioned.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 1, 2025 03:16
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


Comment on lines 394 to 403
> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
**Before**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
```

**After**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
```

Copilot AI Dec 1, 2025

The numbered list item '1.' is followed by a note and code examples, but there's no item '2.' at the same indentation level until line 405. The note and code examples should be indented or formatted as sub-items of item 1 to maintain proper list structure.

Suggested change
> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
**Before**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
```
**After**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
```
> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
**Before**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
```
**After**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
```

Contributor

cccclai commented Dec 2, 2025

@haowhsu-quic can I get someone to review this PR, or suggestions on how to structure it?

1. Update `build.gradle.kts` with matching QNN runtime version

> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
**Before**:
Collaborator

This line is rendered as part of the block quote above.

Contributor Author

Fixed.


##### Issue 1: Error 18 (InvalidArgument)

**Cause**: Wrong parameter order in Runner constructor or missing QNN config
Collaborator

It would be more readable if the cause, symptoms, and solution were indented.

Contributor Author

Added.


github-actions bot commented Dec 3, 2025

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot AI review requested due to automatic review settings December 3, 2025 03:46
Copilot finished reviewing on behalf of luffy-yu December 3, 2025 03:49
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


Comment on lines +324 to +330
***Step 5***: Push model

```bash
adb shell mkdir -p /data/local/tmp/llama
adb push model.pte /data/local/tmp/llama
adb push tokenizer.bin /data/local/tmp/llama
```

Copilot AI Dec 3, 2025

Step 5 references model.pte and tokenizer.bin without explaining where these files come from or providing a reference to model generation instructions. Consider adding a note before Step 5 that links to the model export documentation (e.g., "See How to Support a Custom Model in HTP Backend for model generation instructions" or pointing to the Llama model export process specifically).
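
Independently of where the files come from, Step 5 could guard against pushing artifacts that were never generated. The following is a sketch under that assumption; `require_files` is a hypothetical helper and is not part of the PR.

```bash
# require_files FILE...: succeed only if every listed file exists and is non-empty.
require_files() {
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done
  return 0
}

# Example: push only when both artifacts are present.
# require_files model.pte tokenizer.bin &&
#   adb push model.pte /data/local/tmp/llama &&
#   adb push tokenizer.bin /data/local/tmp/llama
```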

Comment on lines +295 to +332
***Step 1***: Rebuild ExecuTorch AAR

```bash
# Build the AAR
cd $EXECUTORCH_ROOT
export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
./scripts/build_android_library.sh
```

***Step 2***: Copy AAR to Android Project

```bash
cp $EXECUTORCH_ROOT/aar-out/executorch.aar \
$DEMO_APP/app/libs/executorch.aar
```

***Step 3***: Build Android APK

```bash
cd $DEMO_APP
./gradlew clean assembleDebug -PuseLocalAar=true
```

***Step 4***: Install on Device

```bash
adb install -r app/build/outputs/apk/debug/app-debug.apk
```

***Step 5***: Push model

```bash
adb shell mkdir -p /data/local/tmp/llama
adb push model.pte /data/local/tmp/llama
adb push tokenizer.bin /data/local/tmp/llama
```

***Step 6***: Run the Llama Demo

Copilot AI Dec 3, 2025

The step formatting is inconsistent with the existing documentation style. Steps 1-6 use ***Step X***: (with a colon), but the existing document uses ***Step X***. (with a period) as seen in lines 230 and 248. Please update the formatting to match the existing style by replacing colons with periods after the step numbers.
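
The colon-to-period change described here can be applied mechanically. The sketch below assumes the headings start at the beginning of a line exactly as quoted; `fix_step_style` is a hypothetical name, and the file to run it on is left to the author.

```bash
# Rewrite "***Step N***:" to "***Step N***." when it starts a line.
# Reads from stdin and writes to stdout, so it can be tried safely before editing a file.
fix_step_style() {
  sed -E 's/^(\*\*\*Step [0-9]+\*\*\*):/\1./'
}

printf '%s\n' '***Step 5***: Push model' | fix_step_style
# prints: ***Step 5***. Push model
```

Lines that do not begin with a step heading pass through unchanged, so the filter leaves the rest of the document untouched.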

Collaborator

@haowhsu-quic haowhsu-quic left a comment

LGTM, thank you.

@cccclai cccclai merged commit 5b00d91 into pytorch:main Dec 4, 2025
146 of 147 checks passed